35 research outputs found

Tracking by Animation: Unsupervised Learning of Multi-Object Attentive Trackers

Author: David Barber
Hangen He
Zhen He
Jian Li
Daxue Liu
Publication date: 08/04/2019

Online Multi-Object Tracking (MOT) from videos is a challenging computer vision task which has been extensively studied for decades. Most of the existing MOT algorithms are based on the Tracking-by-Detection (TBD) paradigm combined with popular machine learning approaches which largely reduce the human effort to tune algorithm parameters. However, the commonly used supervised learning approaches require the labeled data (e.g., bounding boxes), which is expensive for videos. Also, the TBD framework is usually suboptimal since it is not end-to-end, i.e., it considers the task as detection and tracking, but not jointly. To achieve both label-free and end-to-end learning of MOT, we propose a Tracking-by-Animation framework, where a differentiable neural model first tracks objects from input frames and then animates these objects into reconstructed frames. Learning is then driven by the reconstruction error through backpropagation. We further propose a Reprioritized Attentive Tracking to improve the robustness of data association. Experiments conducted on both synthetic and real video datasets show the potential of the proposed model. Our project page is publicly available at: https://github.com/zhen-he/tracking-by-animation
Comment: CVPR 201

arXiv.org e-Print Archive

Crossref

UCL Discovery

Optimal O

Author: Hangen He
Shengdong Pan
Xiangjing An
Publication venue: Hindawi Limited
Publication date: 01/01/2014

A number of acceleration schemes for speeding up the time-consuming bilateral filter have been proposed in the literature. Among these techniques, the histogram-based bilateral filter trades flexibility for O(1) computational complexity by using a box spatial kernel. A recent study shows that this technique can be leveraged for an O(1) bilateral filter with arbitrary spatial and range kernels by linearly combining the results of multiple box bilateral filters. However, this method requires many box bilateral filters to obtain sufficient accuracy when approximating the bilateral filter with a large spatial kernel. In this paper, we propose approximating an arbitrary spatial kernel using a fixed number of boxes. It turns out that the multiple-box spatial kernel can be applied in many O(1) acceleration schemes in addition to the histogram-based one. Experiments on the application to the histogram-based acceleration are presented in this paper. Results show that the proposed method has better accuracy in approximating the bilateral filter with a Gaussian spatial kernel, compared with the previous histogram-based methods. Furthermore, the performance of the proposed histogram-based bilateral filter is robust with respect to the parameters of the filter kernel.
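
As a rough illustration of the multiple-box idea in the abstract above, the following Python sketch approximates a 1-D spatial kernel by a weighted stack of box filters, each evaluated in constant time per sample from a prefix-sum table. It is not the paper's method: the staircase (layer-cake) weights are a simple stand-in for the approximation the authors propose, the range kernel of the bilateral filter is left out entirely, and all function names are hypothetical.

```python
import math


def gaussian(distance, sigma):
    """Unnormalised Gaussian weight for a given distance (illustrative kernel)."""
    return math.exp(-(distance * distance) / (2.0 * sigma * sigma))


def prefix_sums(signal):
    """Return sums where sums[k] is the sum of the first k samples."""
    sums = [0.0]
    for value in signal:
        sums.append(sums[-1] + value)
    return sums


def box_sum(sums, index, radius, length):
    """Sum of samples within `radius` of `index`, clipped at the borders.

    Two table lookups, so the cost does not depend on the radius.
    """
    lo = max(0, index - radius)
    hi = min(length, index + radius + 1)
    return sums[hi] - sums[lo]


def staircase_weights(kernel, radii):
    """Layer-cake weights: box k gets kernel(r_k) - kernel(r_{k+1}).

    Assumes `radii` is ascending and `kernel` is non-increasing in distance.
    This weighting is an assumption made for this sketch only.
    """
    values = [kernel(r) for r in radii] + [0.0]
    return [values[k] - values[k + 1] for k in range(len(radii))]


def multi_box_filter(signal, kernel, radii):
    """Spatial filtering with a kernel approximated by a stack of boxes."""
    length = len(signal)
    sums = prefix_sums(signal)
    weights = staircase_weights(kernel, radii)
    return [
        sum(w * box_sum(sums, i, r, length) for w, r in zip(weights, radii))
        for i in range(length)
    ]


if __name__ == "__main__":
    samples = [0.0, 1.0, 4.0, 2.0, 5.0, 3.0, 0.5, 2.5]
    # Three boxes give a coarse staircase approximation of the Gaussian.
    print(multi_box_filter(samples, lambda d: gaussian(d, 1.5), [0, 1, 3]))
```

With radii 0..R the stack reproduces the kernel truncated at R exactly; with fewer radii the staircase is coarser, and the per-sample cost depends on the number of boxes rather than on the kernel width.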

Crossref

Directory of Open Access Journals

Multimedia Data Modelling Using Multidimensional Recurrent Neural Networks

Author: Daxue Liu
Hangen He
Liang Xiao
Shaobing Gao
Zhen He
Publication venue: MDPI AG
Publication date: 01/09/2018

Modelling multimedia data such as text, images, or videos usually involves analysing, predicting, or reconstructing them. The recurrent neural network (RNN) is a powerful machine learning approach to modelling these data in a recursive way. As a variant, the long short-term memory (LSTM) extends the RNN with the ability to remember information for longer. Whilst one can increase the capacity of LSTM by widening or adding layers, additional parameters and runtime are usually required, which could make learning harder. We therefore propose a Tensor LSTM where the hidden states are tensorised as multidimensional arrays (tensors) and updated through a cross-layer convolution. As parameters are spatially shared within the tensor, we can efficiently widen the model without extra parameters by increasing the tensorised size; as deep computations of each time step are absorbed by temporal computations of the time series, we can implicitly deepen the model with little extra runtime by delaying the output. We show by experiments that our model is well-suited for various multimedia data modelling tasks, including text generation, text calculation, image classification, and video prediction.

Directory of Open Access Journals

Vision Sensor-Based Road Detection for Field Robot Navigation

Author: Hangen He
Jian Li
Keyu Lu
Xiangjing An
Publication venue: MDPI AG
Publication date: 01/11/2015

Road detection is an essential component of field robot navigation systems. Vision sensors play an important role in road detection for their great potential in environmental perception. In this paper, we propose a hierarchical vision sensor-based method for robust road detection in challenging road scenes. More specifically, for a given road image captured by an on-board vision sensor, we introduce a multiple population genetic algorithm (MPGA)-based approach for efficient road vanishing point detection. Superpixel-level seeds are then selected in an unsupervised way using a clustering strategy. Then, according to the GrowCut framework, the seeds proliferate and iteratively try to occupy their neighbors. After convergence, the initial road segment is obtained. Finally, in order to achieve a globally-consistent road segment, the initial road segment is refined using the conditional random field (CRF) framework, which integrates high-level information into road detection. We perform several experiments to evaluate the common performance, scale sensitivity and noise sensitivity of the proposed method. The experimental results demonstrate that the proposed method exhibits high robustness compared to the state of the art.
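
The GrowCut step mentioned in the abstract above, where seeds proliferate and try to occupy their neighbors, can be sketched as a small cellular automaton. The Python below is a minimal pixel-level illustration under stated assumptions, not the paper's superpixel-level pipeline: intensities are assumed to lie in [0, 1], the attack attenuation is taken as 1 minus the absolute intensity difference, a 4-neighborhood is used, and the function name `growcut` and its parameters are hypothetical.

```python
def growcut(image, seeds, max_iters=100):
    """Grow seed labels over a 2-D grayscale image (lists of lists).

    image: intensities in [0, 1] (assumption of this sketch).
    seeds: same shape, 0 for unlabeled pixels, positive ints for seed labels.
    Returns the label map after convergence or max_iters synchronous steps.
    """
    height, width = len(image), len(image[0])
    labels = [row[:] for row in seeds]
    # Seeds start with full strength, unlabeled pixels with none.
    strength = [[1.0 if s else 0.0 for s in row] for row in seeds]

    for _ in range(max_iters):
        new_labels = [row[:] for row in labels]
        new_strength = [row[:] for row in strength]
        changed = False
        for y in range(height):
            for x in range(width):
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < height and 0 <= nx < width):
                        continue
                    # A neighbor attacks with its strength, attenuated by
                    # how different the two intensities are.
                    similarity = 1.0 - abs(image[y][x] - image[ny][nx])
                    attack = similarity * strength[ny][nx]
                    if attack > new_strength[y][x]:
                        new_labels[y][x] = labels[ny][nx]
                        new_strength[y][x] = attack
                        changed = True
        labels, strength = new_labels, new_strength
        if not changed:
            break  # no pixel was conquered in this step: converged
    return labels


if __name__ == "__main__":
    # Dark left half, bright right half, one seed in each top corner.
    picture = [[0.1, 0.1, 0.1, 0.9, 0.9, 0.9] for _ in range(4)]
    marks = [[0] * 6 for _ in range(4)]
    marks[0][0], marks[0][5] = 1, 2
    for row in growcut(picture, marks):
        print(row)
```

Because attacks across a strong intensity edge are heavily attenuated, each seed tends to claim the region of similar intensity around it, which is the behavior the initial road segment relies on before any refinement.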

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

PubMed Central